RNA From Cytology Samples To Diagnose Disease

ABSTRACT

The invention relates to methods and kits for detecting the likelihood that a subject has cancer, e.g., squamous cell carcinoma, by assaying the expression levels of tumor associated genes. More specifically, the nucleic acid or protein expression levels of tumor associated genes, e.g., beta-2 microglobulin (B2M) and cytochrome p450 1B1 (CYP1B1), can be assayed. The expression levels compared to standards can be indicative of the likelihood that a subject has squamous cell carcinoma. For example, over-expression of B2M and under-expression of CYP1B1 can be indicative of the likelihood that a subject has squamous cell carcinoma. Also, over-expression of B2M and over-expression of CYP1B1 can be indicative of the likelihood that a subject has a precancerous squamous cell disorder. The expression levels of B2M and CYP1B1 can also be repeatedly assayed to monitor the progression of a squamous cell neoplasia.

RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Patent Application Ser. No. 61/037,767, filed Mar. 19, 2008, entitled “RNA From Cytology Samples to Diagnose Disease,” which is herein incorporated by reference.

FIELD OF THE INVENTION

The invention relates to methods and kits for detecting the likelihood that a subject has squamous cell carcinoma or neoplasia.

BACKGROUND OF THE INVENTION

Oral cancer can be any cancerous growth that is found in the mouth. It can arise as a primary lesion originating from any of the oral tissues. The most common form of oral cancer is oral squamous cell carcinoma, originating from the tissues that line the mouth and lips. Most oral cancers are malignant and can spread rapidly. In 2008, in the US alone, more than about 34,000 individuals will be diagnosed with oral cancer. Of these, 66% will be diagnosed with late stage three or four disease.

RNA expression analysis of oral keratinocytes can be used to detect early stages of disease such as oral cancer or to monitor on-going treatment responses of the same or other oral diseases. A limitation is the inability to obtain high quality RNA from oral tissue without using biopsies. While oral cytology cell samples can be obtained from patients in a minimally invasive manner, they have not been validated for quantitative analysis of RNA expression.

Obtaining patient RNA without surgery would be an ideal way to facilitate large-scale genetic studies of cancer and simplify patient diagnosis. Because of the accessibility of the oral and cervical mucosa, methods have been in place for some time to examine histologic and genetic variations in normal and tumor cells. Very recently, methods to analyze RNA from cells and fluids from these organs have been explored. Establishing the validity of these approaches for quantification of gene expression remains an important goal.

Analysis of RNA in urine and saliva has the advantage of ease of use for marker discovery, but it has limitations because it does not provide a direct measure of gene expression in the tissue. It measures RNAs that are stable extracellularly, identifying markers that correlate with disease but are less likely to be informative about disease etiology. Potential problems exist. For example, the unknown contribution of RNA from dead and dying cells may not be readily assessed. Also, subtle differences in investigator sampling can accentuate differences in numbers and types of cells isolated.

Accordingly, there exists a need for better methods and kits for detecting the likelihood that a subject has squamous cell carcinoma or neoplasia. Accurate assay techniques for detecting or monitoring such disease states without resort to surgical biopsies would satisfy a long-felt need in the art.

SUMMARY OF THE INVENTION

Methods and kits are disclosed for detecting the likelihood that a subject has cancer, e.g., squamous cell carcinoma, by assaying the expression levels of tumor associated genes. More specifically, the nucleic acid or protein expression levels of tumor associated genes, e.g., beta-2 microglobulin (B2M) and cytochrome p450 1B1 (CYP1B1), can be assayed. The expression levels compared to standards can be indicative of the likelihood that a subject has squamous cell carcinoma. For example, over-expression of B2M and under-expression of CYP1B1 can be indicative of the likelihood that a subject has oral squamous cell carcinoma. Also, over-expression of B2M and over-expression of CYP1B1 can be indicative of the likelihood that a subject has a precancerous oral squamous cell disorder. The expression levels of B2M and CYP1B1 can also be repeatedly assayed to monitor the progression of a squamous cell neoplasia.

In one aspect of the invention, a method for detecting the likelihood that a subject has squamous cell carcinoma comprises obtaining a brush cytology sample from a subject, extracting nucleic acids from cells in the sample, and assaying the nucleic acids for expression levels of non-degraded nucleic acid sequences coding for production of beta-2 microglobulin (B2M) and cytochrome p450 1B1 (CYP1B1), wherein over-expression of the B2M gene compared to a standard, together with under-expression of the CYP1B1 gene compared to a second standard, is indicative of a likelihood that the subject has squamous cell carcinoma.

In another aspect of the invention, brush cytology sampling is used to obtain squamous cells suitable for assays. The brush cytology instrument or brush can have one or two cutting surfaces. Brushes with one surface can comprise a rod with perpendicular bristles. Brushes with two surfaces can comprise a flat end of the brush and a circular border of the brush. Either surface can be used to obtain the specimen. To obtain the brush cytology sample, firm pressure with a brush can be applied to the area to be sampled. In some embodiments, a brush can be rotated in at least 20 brush strokes, where a single brush stroke is a forward to backward/backward to forward, a side to side or circular movement to obtain the sample. In some other embodiments, a first brush can be rotated in two to five brush strokes to prime the surface by removing external dead or dying cells and exposing underlying layers, and the first brush is then discarded. Then a second brush can be rotated in the same location in at least 20 brush strokes to obtain the sample.

In one embodiment of the invention, the method further comprises amplifying and quantifying expression of the B2M and CYP1B1 genes by real time polymerase chain reaction (q-PCR) using primers complementary to an mRNA sequence of at least 15 bases found at least 500 basepairs and preferably at least 1000 basepairs from the encoded 3′ ends of the B2M and CYP1B1 mRNA transcripts. By amplifying expression products at least 500 basepairs and preferably at least 1000 basepairs from the encoded 3′ ends of the transcripts, corresponding to the transcriptional start site, substantially full length and non-degraded nucleic acid sequences capable of producing proteins of interest, e.g., beta-2 microglobulin (B2M) or cytochrome p450 1B1 (CYP1B1), can be detected. Typically, non-degraded nucleic acid sequences are extracted from living cells taken as part of the sample. In contrast, degraded nucleic acid sequences are typically extracted from dead cells or cells undergoing apoptosis.
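The primer placement criterion in this embodiment can be illustrated with a short sketch. It is not part of the disclosed method; it assumes hypothetical 0-based coordinates along an mRNA transcript, and the function names and example values are chosen here for illustration only.

```python
def distance_from_3prime_end(transcript_length, primer_start, primer_length):
    """Number of bases between the end of the primer-binding site and the
    encoded 3' end of the transcript (0-based start, hypothetical coordinates)."""
    primer_end = primer_start + primer_length  # exclusive end position
    if primer_start < 0 or primer_end > transcript_length:
        raise ValueError("primer site lies outside the transcript")
    return transcript_length - primer_end


def primer_site_qualifies(transcript_length, primer_start, primer_length,
                          min_distance=500, min_primer_length=15):
    """True if the site is at least min_primer_length bases long and lies at
    least min_distance bases from the 3' end (500, or preferably 1000)."""
    if primer_length < min_primer_length:
        return False
    distance = distance_from_3prime_end(transcript_length, primer_start, primer_length)
    return distance >= min_distance


# Hypothetical 2000-base transcript with a 20-base primer site near the 5' end.
print(primer_site_qualifies(2000, 100, 20))                     # 500-base criterion
print(primer_site_qualifies(2000, 100, 20, min_distance=1000))  # preferred criterion
```

Passing min_distance=1000 applies the preferred threshold; a site close to the 3′ end fails either check, which reflects the intent of favoring amplification from substantially full length transcripts.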

In another aspect, the invention is directed to a method for detecting the likelihood that a subject has squamous cell carcinoma, comprising detecting beta-2 microglobulin (B2M) and cytochrome p450 1B1 (CYP1B1) protein or nucleic acid expression levels in a sample from the subject, wherein over-expression of the B2M gene compared to a standard, together with under-expression of the CYP1B1 gene compared to a second standard, is indicative of a likelihood that the subject has oral squamous cell carcinoma. Moreover, another aspect of the invention is directed to detecting the likelihood that a subject has a precancerous squamous cell disorder, comprising detecting beta-2 microglobulin (B2M) and cytochrome p450 1B1 (CYP1B1) protein or nucleic acid expression levels in a sample from the subject, wherein over-expression of the B2M gene compared to a standard, together with over-expression of the CYP1B1 gene compared to a second standard, is indicative of a likelihood that the subject has a precancerous oral squamous cell disorder.
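The two marker patterns described in these aspects can be summarized as a simple decision rule. The following sketch is illustrative only. It assumes that each expression level has already been normalized to its standard, so that a value of 1.0 means equal to the standard; the class name, function name, and threshold parameters are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ExpressionResult:
    b2m: float     # B2M level relative to its standard (1.0 = equal; assumption)
    cyp1b1: float  # CYP1B1 level relative to the second standard (1.0 = equal)


def classify(result, over=1.0, under=1.0):
    """Map the B2M/CYP1B1 pattern to the indication described in the text."""
    b2m_over = result.b2m > over
    if b2m_over and result.cyp1b1 < under:
        return "likelihood of squamous cell carcinoma"
    if b2m_over and result.cyp1b1 > over:
        return "likelihood of precancerous squamous cell disorder"
    return "no indication from this marker pair"


print(classify(ExpressionResult(b2m=2.5, cyp1b1=0.4)))
print(classify(ExpressionResult(b2m=2.5, cyp1b1=1.9)))
```

In practice the thresholds would be set relative to the chosen standards; the defaults here are placeholders rather than values taken from the disclosure.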

In one aspect, a method for monitoring squamous cell neoplasia in a human subject over time comprises obtaining a brush cytology sample from a subject at a first time, extracting nucleic acids from cells in the sample, assaying said nucleic acids for the expression level of genes coding for the production of beta-2 microglobulin (B2M) and cytochrome p450 1B1 (CYP1B1), and repeating the steps of obtaining a sample, extracting nucleic acids and assaying for expression levels of B2M and CYP1B1 at a later time, wherein increased expression of the B2M gene at a later time or decreased expression of the CYP1B1 gene at a later time is indicative of progression of neoplasia. In one embodiment, squamous cell neoplasia in a human subject can be monitored over time in response to a treatment. A sample can be obtained, nucleic acids can be extracted from the sample, and the expression levels of genes encoding beta-2 microglobulin (B2M) and cytochrome p450 1B1 (CYP1B1) can be assayed. A treatment can then be administered, wherein the treatment is a bioactive agent that inhibits cytochrome p450 proteins. Sampling from the subject can be repeated over time, and the expression levels of B2M and CYP1B1 at a later time are indicative of the response to the treatment.
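The comparison of a later sample against the first sample can be sketched as follows. This is a minimal illustration, assuming expression levels are stored as plain numbers keyed by gene name; the time labels, data layout, and function names are hypothetical and no real measurements are shown.

```python
def indicates_progression(first, later):
    """True if B2M increased or CYP1B1 decreased relative to the first sample."""
    b2m_increased = later["B2M"] > first["B2M"]
    cyp1b1_decreased = later["CYP1B1"] < first["CYP1B1"]
    return b2m_increased or cyp1b1_decreased


def monitor(series):
    """Compare every later time point with the first one.

    series is a list of (label, levels) pairs in chronological order.
    Returns a list of (label, flag) pairs for the later time points.
    """
    _, baseline = series[0]
    return [(label, indicates_progression(baseline, levels))
            for label, levels in series[1:]]


# Made-up values for illustration only.
samples = [
    ("week 0", {"B2M": 1.0, "CYP1B1": 1.0}),
    ("week 4", {"B2M": 1.0, "CYP1B1": 1.0}),
    ("week 8", {"B2M": 1.8, "CYP1B1": 0.6}),
]
print(monitor(samples))
```

A real assay would also need a tolerance for measurement noise before calling a change an increase or decrease; that refinement is omitted here.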

In yet another aspect, the invention is directed to a kit for assessing the presence of cancer in a sample comprising a pair of primers which specifically hybridize to at least one non-degraded nucleic acid sequence coding for production of a beta-2 microglobulin (B2M) gene product or a cytochrome p450 1B1 (CYP1B1) gene product and reagents for real-time polymerase chain reaction (q-PCR). In additional embodiments, the kit can comprise additional tools, reagents or instruction manuals. For example, the kit can comprise a brush for obtaining a brush cytology sample from a subject. Also, the kit can comprise a nucleic acid extraction reagent to isolate nucleic acids from a sample.

Further understanding of various aspects of the invention can be obtained by reference to the following detailed description in conjunction with the associated drawings, which are described briefly below.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a flow chart depicting various steps in an embodiment of a method of the invention to detect the likelihood a subject has oral squamous cell carcinoma;

FIG. 2 is a flow chart depicting various steps in an embodiment of a method of the invention to detect the likelihood a subject has a precancerous oral squamous cell disorder;

FIG. 3 is a flow chart depicting various steps in an embodiment of a method of the invention to monitor the progression of an oral squamous cell neoplasia over time;

FIG. 4 is a flow chart depicting various steps in an exemplary embodiment of a method of the invention to detect the likelihood a subject has oral squamous cell carcinoma;

FIG. 5 is a flow chart depicting various steps in an exemplary embodiment of a method of the invention to detect the likelihood a subject has a precancerous oral squamous cell disorder;

FIG. 6A shows hematoxylin and eosin-stained tissue sections from control unexposed hamster oral tissue (floor of mouth and lateral border of tongue) (bar=200 μm), with one example of stratified squamous epithelium (SSE) labelled;

FIG. 6B shows hematoxylin and eosin-stained tissue sections after 33 weeks of exposure to dibenzo[a,l]pyrene that reveal histopathologic changes characteristic of oral squamous cell carcinoma (bar=200 μm);

FIG. 7A shows a bar graph comparing expression of the B2M gene in brush cytology samples from 13 animals. Each bar represents the relative mRNA level of one of three samples taken on consecutive weeks from oral squamous cell carcinoma tumor in five hamsters and normal mucosa of eight control hamsters. Shown is the mean of three PCR runs of a single sample. For each animal the overall intraclass correlation (ICC) among each set of three measurements was calculated;

FIG. 7B shows a bar graph comparing expression of the CDK2AP1 gene in brush cytology samples from 13 animals;

FIG. 7C shows a bar graph comparing expression of the CYP1B1 gene in brush cytology samples from 13 animals;

FIG. 7D shows a bar graph comparing expression of the GSTP1 gene in brush cytology samples from 13 animals;

FIG. 7E shows a bar graph comparing expression of the PECAM1 gene in brush cytology samples from 13 animals;

FIG. 7F shows a bar graph comparing expression of the VEGF gene in brush cytology samples from 13 animals;

FIG. 8A shows a bar graph comparing measured B2M mRNA levels in brush cytology RNA samples vs. surgically excised (biopsy) tissue from 13 animals. The mean mRNA levels for each tested gene were calculated for brush cytology samples (black bars) vs. surgically removed tissue (white bars) ±SEM. The values for the brush cytology cell mRNA were averaged over three separate brush cytology samples. The correlation coefficient (R) comparing the derived values from the two cell sources for each hamster was derived;

FIG. 8B shows a bar graph comparing measured CDK2AP1 mRNA levels in brush cytology RNA samples vs. surgically excised (biopsy) tissue from 13 animals;

FIG. 8C shows a bar graph comparing measured CYP1B1 mRNA levels in brush cytology RNA samples vs. surgically excised (biopsy) tissue from 13 animals;

FIG. 8D shows a bar graph comparing measured GSTP1 mRNA levels in brush cytology RNA samples vs. surgically excised (biopsy) tissue from 13 animals;

FIG. 8E shows a bar graph comparing measured PECAM1 mRNA levels in brush cytology RNA samples vs. surgically excised (biopsy) tissue from 13 animals;

FIG. 8F shows a bar graph comparing measured VEGF mRNA levels in brush cytology RNA samples vs. surgically excised (biopsy) tissue from 13 animals;

FIG. 9A depicts immunofluorescent staining of a mucosal biopsy sample, showing cytokeratin staining specifically in the cells of the epithelium (bar=10 μm, BM is basement membrane);

FIG. 9B depicts brush cytology sample cells, which were highly enriched for cytokeratin staining; and

FIG. 9C shows that brush cytology sample RNA was enriched for epithelial markers (CDH1 and CX-26) and depressed for non-epithelial cell markers (DES and VIM) vs. the biopsy sample RNA. RNA was from five control hamsters.

DETAILED DESCRIPTION OF THE INVENTION

In contrast to analysis of RNA in urine and saliva, RNA analysis from brush oral cytology has the advantage that live cells can be isolated from a site at risk for a disease such as oral squamous cell carcinoma (OSCC). Early changes in the disease progression that affect gene expression can be detected and, because of the minimal invasiveness, the assay can be carried out repeatedly.

Pilot studies from the literature demonstrate that the isolation of RNA from brush oral cytology is possible and that mRNA can be detected using q-PCR or microarray analysis, but it is not clear how reliable the method is and what is being measured. One study indicated that 10-20% of the oral brush cytology mucosal cells from humans were viable as isolated, while we saw somewhat higher numbers from hamsters. In their human study, Spivack et al. saw a qualitative correlation of the detectability of expression of a number of mRNAs in laser microdissected lung tissue and brush cytology cells from the same patients. However, large inter-patient variability in mRNA quantitation was seen (up to 10,000-fold) and the source of this variation was not explored. In another pilot study, RNAs from brush cytology cervical cells were compared to those from a surgically removed cervical tissue specimen by DNA microarray analysis, revealing that similar groups of genes were expressed above background.

The present invention relates, in part, to newly discovered correlations between the expression of selected genes, in particular, beta-2 microglobulin (B2M) and cytochrome p450 1B1 (CYP1B1), and the presence of cancer, such as squamous cell carcinoma, in a subject. The relative expression level of the genes, e.g., B2M and CYP1B1, has been found to be indicative of squamous cell carcinoma in the subject and/or diagnostic of the presence or potential presence of squamous cell carcinoma in a subject. The invention features methods for detecting the likelihood a subject has squamous cell carcinoma, and methods of detecting the likelihood a subject has a precancerous squamous cell disorder by assaying nucleic acids for relative expression levels of B2M and CYP1B1 genes as compared to a standard.

The invention is also based, at least in part, on the identification of genes which are differentially expressed in samples from squamous carcinoma cells compared to non-cancer cells. A panel of known genes was screened for differential expression patterns in oral brush cytology samples (see Examples 1 and 2). Those genes with statistically significant (p<0.01) differences between the diseased and normal tissues were identified. This differential expression was observed either as a decrease in expression, or an increase in expression.

Accordingly, the present invention pertains to the analysis of B2M and/or CYP1B1 genes, the corresponding mRNA transcripts, and the encoded polypeptides, as an indication of the presence of, the risk for development of, and the progression of squamous cell carcinoma. Overexpression of the B2M gene can be indicative of the presence of disease and a precancerous oral squamous cell disorder. Overexpression of the CYP1B1 gene can also be indicative of the likelihood a subject has a precancerous oral squamous cell disorder, while the underexpression of the CYP1B1 gene can be indicative that the subject has oral squamous cell carcinoma.

Detection of the presence or expression levels of non-degraded nucleic acid sequences, e.g., at least 500 basepairs and preferably at least 1000 basepairs from the encoded 3′ ends of the B2M or CYP1B1 mRNA transcripts, in nucleic acids can be performed using methods known in the art. Typically, it can be convenient to assess the presence and/or quantity of mRNA or cDNA by real-time polymerase chain reaction, also referred to as quantitative PCR (q-PCR), in which mRNA can be isolated from a cell or tissue sample, converted to cDNA using reverse transcriptase by methods known in the art, hybridized with gene specific oligonucleotides (e.g., B2M or CYP1B1 primers), and amplified in the presence of a probe or diagnostic label. The label group can be a fluorescent compound. Other useful methods of mRNA detection and/or quantification include northern blot, gel electrophoresis, column chromatography, q-PCR, and other methods known by one skilled in the art.
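One widely used way to turn q-PCR threshold cycle (Ct) readings into a relative expression level is the comparative Ct calculation (often written 2^-ΔΔCt). The text above does not specify a quantification formula, so the sketch below is an assumption offered for illustration: it presumes a separate reference gene, roughly 100% amplification efficiency, and made-up Ct values.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene in a sample vs. a control, normalized to a
    reference gene, by the comparative Ct calculation (assumes each PCR cycle
    doubles the product)."""
    delta_sample = ct_target_sample - ct_ref_sample      # normalize the sample
    delta_control = ct_target_control - ct_ref_control   # normalize the control
    delta_delta = delta_sample - delta_control
    return 2.0 ** -delta_delta


# Made-up Ct values: the target crosses threshold two cycles earlier in the
# sample than in the control, relative to the reference gene.
fold = relative_expression(22.0, 18.0, 24.0, 18.0)
print(fold)
```

A lower Ct means more starting template, so a negative ΔΔCt corresponds to a fold change above 1 (over-expression) and a positive ΔΔCt to a fold change below 1 (under-expression).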

In another aspect, the invention provides a method for detecting the likelihood that a subject has squamous carcinoma by assaying the expression levels of the B2M and CYP1B1 genes (FIG. 1). The genes, e.g., B2M and CYP1B1, are either increased or decreased in expression level in the cancer tissue in a fashion that is either positively or negatively indicative of the subject having squamous cell carcinoma. In yet another aspect, the invention provides a method for detecting the likelihood that a subject has a precancerous squamous cell disorder by assaying the expression levels of the B2M and CYP1B1 genes (FIG. 2). The genes are either increased or decreased in expression level in a manner that can be indicative that the subject has a precancerous squamous cell disorder.

In yet another aspect, the invention provides a method for monitoring squamous cell neoplasia in a human subject over time by repeatedly assaying the expression levels of the B2M and CYP1B1 genes (FIG. 3).

The terms used in this invention adhere to the standard definitions generally accepted by those having ordinary skill in the art. In case any further explanation might be needed, some terms have been elucidated below and throughout the application.

A “nucleic acid molecule” refers to the phosphate ester polymeric form of ribonucleosides (adenosine, guanosine, uridine or cytidine; “RNA molecules”) or deoxyribonucleosides (deoxyadenosine, deoxyguanosine, deoxythymidine, or deoxycytidine; “DNA molecules”) in either single stranded form, or a double-stranded helix. Double stranded DNA-DNA, DNA-RNA and RNA-RNA helices are possible. The term nucleic acid molecule, and in particular DNA or RNA molecule, refers only to the primary and secondary structure of the molecule, and does not limit it to any particular tertiary forms. Thus, this term includes double-stranded DNA found, inter alia, in linear or circular DNA molecules (e.g., restriction fragments), plasmids, and chromosomes. In discussing the structure of particular double-stranded DNA molecules, sequences may be described herein according to the normal convention of giving only the sequence in the 5′ to 3′ direction along the nontranscribed strand of DNA (i.e., the strand having a sequence homologous to the mRNA). A “recombinant DNA molecule” is a DNA molecule that has undergone a molecular biological manipulation.

As used herein, the terms “polynucleotide,” “oligonucleotide” and “nucleic acid sequences” are used interchangeably, and include polymeric forms of nucleotides of any length, either deoxyribonucleotides or ribonucleotides, or analogs thereof. Polynucleotides can have any three-dimensional structure, and can perform any function, known or unknown. The following are non-limiting examples of polynucleotides: a gene or gene fragment, exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, ribozymes, complementary DNA (cDNA), recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes, and primers. A polynucleotide can comprise modified nucleotides, such as methylated nucleotides and nucleotide analogs. If present, modifications to the nucleotide structure can be imparted before or after assembly of the polymer. The sequence of nucleotides can be interrupted by non-nucleotide components. A polynucleotide can be further modified after polymerization, such as by conjugation with a labeling component. The term also includes both double- and single-stranded molecules. Unless otherwise specified or required, any embodiment of this invention that is a polynucleotide encompasses both the double-stranded form and each of two complementary single-stranded forms known or predicted to make up the double-stranded form.

A polynucleotide is composed of a specific sequence of four nucleotide bases: adenine (A); cytosine (C); guanine (G); thymine (T); and uracil (U) for thymine when the polynucleotide is RNA. Thus, the term “polynucleotide sequence” is the alphabetical representation of a polynucleotide molecule. This alphabetical representation can be input into databases in a computer having a central processing unit and used for bioinformatics applications such as functional genomics and homology searching.
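As a small illustration of handling this alphabetical representation in software, the sketch below validates a DNA string against the four-letter alphabet and produces the corresponding RNA string by substituting uracil for thymine. The function name and example sequence are hypothetical and are not taken from the disclosure.

```python
DNA_BASES = set("ACGT")


def to_rna(dna_sequence):
    """Return the RNA alphabet form of a DNA sequence string (T replaced by U)."""
    sequence = dna_sequence.upper()
    unexpected = set(sequence) - DNA_BASES
    if unexpected:
        raise ValueError(f"unexpected characters in DNA sequence: {sorted(unexpected)}")
    return sequence.replace("T", "U")


# Arbitrary example sequence, not a B2M or CYP1B1 sequence.
print(to_rna("ATGGCTTAC"))
```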

A “gene” includes a polynucleotide containing at least one open reading frame that is capable of encoding a particular polypeptide or protein after being transcribed and translated. Any of the polynucleotide sequences described herein can be used to identify larger fragments or full-length coding sequences of the gene with which they are associated. Methods of isolating larger fragment sequences are known to those of skill in the art, some of which are described herein. Previously known and uncharacterized polymorphisms in beta-2 microglobulin (B2M) and cytochrome p450 1B1 (CYP1B1) genes are also included within this invention. In addition, alternative splicing products that produce variation in the mRNA expression pattern are also included.

A “gene product” includes an amino acid (e.g., peptide or polypeptide) generated when a gene is transcribed and translated.

The term “tumor-associated genes” as used herein refers to genes found to be differentially expressed, either over-expressed or under-expressed, in cancer tissue and originally identified by their differential expression in cancer cells compared to non-cancer cells.

The term “non-degraded” nucleic acid sequences as used herein refers to substantially full length nucleic acid sequences capable of producing proteins of interest, e.g., beta-2 microglobulin (B2M) or cytochrome p450 1B1 (CYP1B1). Typically, non-degraded nucleic acid sequences are extracted from living cells taken as part of the sample. In contrast, “degraded nucleic acid sequences” are typically extracted from dead cells or cells undergoing apoptosis. By amplifying expression products at least 500 basepairs and preferably at least 1000 basepairs from the encoded 3′ ends of the mRNA transcripts, corresponding to the transcriptional start site, substantially full length and non-degraded nucleic acid sequences capable of producing proteins of interest, e.g., beta-2 microglobulin (B2M) or cytochrome p450 1B1 (CYP1B1), can be detected. Alternatively, non-degraded nucleic acid sequences preferable for the assay purposes disclosed herein are typically at least 50 percent or more of the full-length gene, and suitable primers can be used to selectively amplify such nucleic acid sequences.

A “probe” when used in the context of polynucleotide manipulation includes a reagent to detect a target present in a sample of interest by hybridization or incorporation with the target. Usually, a probe will comprise a label or a means by which a label can be attached or incorporated with the target. Suitable labels include, but are not limited to, fluorochromes, chemiluminescent compounds, dyes, and proteins, including enzymes.

A “primer” includes a short polynucleotide, generally with a free 3′-OH group, that binds to a target or “template” present in a sample of interest by hybridizing with the target, and thereafter promotes polymerization of a polynucleotide complementary to the target. A “polymerase chain reaction” (“PCR”) is a reaction in which replicate copies are made of a target polynucleotide using a “pair of primers” or “set of primers” consisting of an “upstream” and a “downstream” primer, and a catalyst of polymerization, such as a DNA polymerase, and typically a thermally-stable polymerase enzyme. Methods for PCR are well known in the art, and are taught, for example, in MacPherson et al., IRL Press at Oxford University Press (1991). “Quantitative PCR” (“q-PCR”), also referred to herein as real-time PCR, is based on PCR to amplify and simultaneously quantify a target DNA molecule. All processes of producing replicate copies of a polynucleotide, such as PCR or gene cloning, are collectively referred to herein as “replication”. A primer can also be used as a probe in hybridization reactions, such as Southern or Northern blot analyses (see, e.g., Sambrook, J., Fritsch, E. F., and Maniatis, T. Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989).

The term “cDNAs” includes complementary DNA, that is, mRNA molecules present in a cell or organism converted into cDNA with an enzyme such as reverse transcriptase. A “cDNA library” includes a collection of mRNA molecules present in a cell or organism, converted into cDNA molecules with the enzyme reverse transcriptase, then inserted into “vectors” (other DNA molecules that can continue to replicate after addition of foreign DNA). Exemplary vectors for libraries include bacteriophage, i.e., viruses that infect bacteria (e.g., lambda phage). The library can then be probed for the specific cDNA (and thus mRNA) of interest.

A DNA “coding sequence” is a double-stranded DNA sequence which is transcribed and translated into a polypeptide in a cell in vitro or in vivo when placed under the control of appropriate regulatory sequences. The boundaries of the coding sequence are determined by a start codon at the 5′ (amino) terminus and a translation stop codon at the 3′ (carboxyl) terminus. A coding sequence can include, but is not limited to, prokaryotic sequences, cDNA from eukaryotic mRNA, genomic DNA sequences from eukaryotic (e.g., mammalian) DNA, and even synthetic DNA sequences. If the coding sequence is intended for expression in a eukaryotic cell, a polyadenylation signal and transcription termination sequence can usually be located 3′ to the coding sequence.

Transcriptional and translational control sequences are DNA regulatory sequences, such as promoters, enhancers, terminators, and the like, that provide for the expression of a coding sequence in a host cell. In eukaryotic cells, polyadenylation signals are control sequences. Various splice acceptor sites can be necessary for RNA splicing and can be included herein within the definition of “control sequences.” Some such sequences also play a role in the abundance and stage-specificity of gene expression.

As used herein, “expression” includes the process by which polynucleotides are transcribed into mRNA and translated into peptides, polypeptides, or proteins. If the polynucleotide is derived from genomic DNA, expression may include splicing of the mRNA, if an appropriate eukaryotic host is selected. Regulatory elements required for expression can include promoter sequences to bind RNA polymerase and translation initiation sequences for ribosome binding. For example, a bacterial expression vector includes a promoter such as the lac promoter and, for translation initiation, the Shine-Dalgarno sequence and the start codon AUG (Sambrook, J., Fritsch, E. F., and Maniatis, T. Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989). Similarly, a eukaryotic expression vector can include a heterologous or homologous promoter for RNA polymerase II, a downstream polyadenylation signal, the start codon AUG, and a termination codon for detachment of the ribosome. Such vectors can be obtained commercially or assembled from the sequences described using methods well known in the art, for example, the methods described below for constructing vectors in general.

“Differentially expressed”, as applied to a gene, includes the differential production of mRNA transcribed from a gene or a protein product encoded by the gene. A differentially expressed gene may be overexpressed or underexpressed as compared to the expression level of a normal, control cell or standard. In one aspect, it includes a differential that can be 1.5 times, preferably 2 times or preferably greater than 2 times higher or lower than the expression level detected in a control sample. The term “differentially expressed” can also include nucleotide sequences in a cell or tissue which are expressed where silent in a control cell or not expressed where expressed in a control cell.
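The fold-difference criterion in this definition can be expressed as a short check. The sketch below is illustrative only. It interprets “1.5 times lower” as a sample-to-control ratio of 1/1.5 or less, which is an assumption about how the threshold is applied; the function name and example values are hypothetical.

```python
def is_differentially_expressed(sample_level, control_level, fold=1.5):
    """True if the sample level is at least `fold` times higher or lower than
    the control level (both levels must be positive)."""
    if sample_level <= 0 or control_level <= 0:
        raise ValueError("expression levels must be positive")
    ratio = sample_level / control_level
    return ratio >= fold or ratio <= 1.0 / fold


# Made-up expression levels in arbitrary units.
print(is_differentially_expressed(3.0, 1.0, fold=2.0))  # over-expressed
print(is_differentially_expressed(0.5, 1.0, fold=2.0))  # under-expressed
print(is_differentially_expressed(1.2, 1.0))            # within 1.5-fold
```

The zero-expression case named at the end of the definition (expressed where silent in the control, or the reverse) has no finite ratio and would need separate handling; this sketch rejects it rather than guessing.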

The term “polypeptide” includes a compound of two or more subunit amino acids. The subunits can be linked by peptide bonds. In another embodiment, the subunits can be linked by other bonds, e.g., ester, ether, etc. As used herein the term “amino acid” includes either natural and/or unnatural or synthetic amino acids, including glycine and both the D and L optical isomers. A peptide of three or more amino acids can commonly be referred to as an oligopeptide. Peptide chains of greater than three amino acids can be referred to as a polypeptide or a protein.

“Hybridization” includes a reaction in which one or more polynucleotides react to form a complex that can be stabilized via hydrogen bonding between the bases of the nucleotide residues. The hydrogen bonding can occur by Watson-Crick base pairing, Hoogsteen binding, or in any other sequence-specific manner. The complex can comprise two strands forming a duplex structure, three or more strands forming a multi-stranded complex, a single self-hybridizing strand, or any combination of these. A hybridization reaction can constitute a step in a more extensive process, such as the initiation of a PCR reaction, or the enzymatic cleavage of a polynucleotide by a ribozyme.

A nucleic acid molecule can be “hybridizable” to another nucleic acid molecule, such as a cDNA, genomic DNA, or RNA, when a single stranded form of the nucleic acid molecule can anneal to the other nucleic acid molecule under the appropriate conditions of temperature and solution ionic strength (see Sambrook et al., 1989, supra). The conditions of temperature and ionic strength determine the “stringency” of the hybridization. Hybridization requires that the two nucleic acids contain complementary sequences, although depending on the stringency of the hybridization, mismatches between bases are possible. The appropriate stringency for hybridizing nucleic acids depends on the length of the nucleic acids and the degree of complementation, variables well known in the art. Preferably a minimum length for a hybridizable nucleic acid can be at least about 10 nucleotides; more preferably at least about 15 nucleotides.

Hybridization reactions can be performed under conditions of different “stringency”. The stringency of a hybridization reaction includes the difficulty with which any two nucleic acid molecules can hybridize to one another. Under stringent conditions, nucleic acid molecules at least 60%, 65%, 70%, 75% identical to each other remain hybridized to each other, whereas molecules with low percent identity cannot remain hybridized.

When hybridization occurs in an antiparallel configuration between two single-stranded polynucleotides, the reaction is called “annealing” and those polynucleotides are described as “complementary”. A double-stranded polynucleotide can be “complementary” or “homologous” to another polynucleotide, if hybridization can occur between one of the strands of the first polynucleotide and the second. “Complementarity” or “homology” (the degree that one polynucleotide is complementary with another) is quantifiable in terms of the proportion of bases in opposing strands that are expected to hydrogen bond with each other, according to generally accepted base-pairing rules.
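As a hedged illustration of complementarity expressed as a proportion, the sketch below counts Watson-Crick pairs between two equal-length DNA strands aligned in antiparallel orientation. The function name `complementarity` and the restriction to ungapped, equal-length DNA strands are simplifying assumptions made only for this example.

```python
# Watson-Crick partners for DNA bases (assumption: DNA only, no RNA or wobble pairs).
WATSON_CRICK = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementarity(strand_a, strand_b):
    """Return the fraction of positions expected to base pair.

    Both strands are written 5' to 3'. The second strand is reversed so that
    the two are compared in antiparallel orientation.
    """
    if not strand_a or len(strand_a) != len(strand_b):
        raise ValueError("strands must be non-empty and of equal length")
    a = strand_a.upper()
    b_reversed = strand_b.upper()[::-1]
    paired = sum(1 for x, y in zip(a, b_reversed) if WATSON_CRICK.get(x) == y)
    return paired / len(a)
```

For example, `complementarity("ATGC", "GCAT")` evaluates to 1.0 because every base pairs with its antiparallel partner, while a single mismatch in a four-base strand lowers the value to 0.75.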

An “antibody” includes an immunoglobulin molecule capable of binding an epitope present on an antigen. As used herein, the term encompasses not only intact immunoglobulin molecules such as monoclonal and polyclonal antibodies, but also anti-idiotypic antibodies, mutants, fragments, fusion proteins, bi-specific antibodies, humanized proteins, and modifications of the immunoglobulin molecule that comprise an antigen recognition site of the required specificity.

The term “cancerous” as used herein is intended to refer to any abnormal cells that divide without control characterized by the proliferation of anaplastic cells that can invade surrounding tissues and metastasize to new body sites.

The term “oral cancer” as used herein refers to any cancerous tissue growth located in the mouth. It can arise as a primary lesion originating in any of the oral tissues, by metastasis from a distant site of origin, or by extension from a neighboring anatomic structure. Oral cancers can originate in any of the tissues of the mouth. The most common oral cancer is squamous cell carcinoma, originating in the tissues that line the mouth and lips. Oral or mouth cancer most commonly involves the tissue of the lips or the tongue. Oral cancer can also occur on the floor of the mouth, cheek lining, gingiva (gums), or the palate (roof of the mouth). Many oral cancers can be malignant and can spread rapidly. Oral cells can include, but are not limited to, pseudostratified epithelium, columnar epithelium and a variety of squamous epithelium: keratinized, non-keratinized and stratified.

The term “squamous cell carcinoma” refers to a type of cancer that can occur in a variety of organs, including, but not limited to: lips, skin, mouth, nose, esophagus, urinary bladder, prostate, lungs, vagina and cervix.

The term “subject” refers to any living organism. The term subject comprises, but is not limited to, humans, nonhuman primates such as chimpanzees and other apes and monkey species; farm animals such as cattle, sheep, pigs, goats and horses; domestic mammals such as dogs and cats; laboratory animals including rodents such as mice, rats and guinea pigs, and the like. The term does not denote a particular age or sex. Thus, adult and newborn subjects, as well as fetuses, whether male or female, are intended to be covered. In preferred embodiments, the subject is a mammal, including humans and non-human mammals. In a more preferred embodiment, the subject is a mammal. In the most preferred embodiment, the subject is a human.

The terms “sample,” “sample from a subject” and “extracted sample” as used herein refer to a small quantity of tissue from a subject, which can be obtained, e.g., by employing methods known in the art. Such a tissue sample, e.g., brush cytology sample, can contain cancer cells, non-cancer cells or both. The term sample comprises, but is not limited to, oral tissues, oral cells from the mouth, lips, tongue, cheek lining, gingiva, palate, skin, nose, esophagus, urinary bladder, prostate, lungs, vagina and cervix of a subject.

The term “standard” as used herein refers to a control sample. The “standard” expression levels can be detected, for example, in non-cancer samples, normal subjects without cancer or untreated samples. The “standard” expression level can also refer to nucleic acid expression levels or protein levels present in non-cancer samples, normal subjects without cancer or untreated samples. Standards can provide a control or comparison for determining the outcome of the experiment. Internal “standard” refers to an experimental optimal control to determine the consistency of an experiment or set of experiments. An example of internal standards can be potential housekeeping genes identified on their constant expression in many tissues or on consistent levels in normal and tumor tissue.

Various aspects of the invention are described in further detail in the following subsections:

I. Beta-2 Microglobulin (B2M)

Beta-2 microglobulin (B2M) (NM_004048) (SEQ ID.: 1) is a component of the major histocompatibility complex (MHC) class 1 molecules, which are present on almost all nucleated cells of the body. B2M lies lateral to the alpha3 chain on the cell surface and lacks a transmembrane domain. It interacts with the alpha chains and class 1-like molecules, which are important for antigen presentation.

Beta-2-microglobulin has been found in the serum of normal individuals and in the urine in elevated amounts in patients with Wilson disease, cadmium poisoning, and other conditions leading to renal tubular dysfunction.

Previous studies have found that some tumors lack cell surface expression of HLA class 1 molecules and this can be one mechanism by which tumor cells escape immune recognition by cytotoxic T cells. In some cases, tumor escape is due to loss of the heavy chain surface expression encoded by the HLA-A, -B, and -C genes; in other cases, defects in expression of the B2M gene for the light chain can be responsible.

The Daudi lymphoblastoid cell line, derived from a patient with Burkitt lymphoma and lacking both HLA antigens and beta-2 microglobulin, fails to express HLA class 1 molecules because of a specific defect in the B2M component. In the human melanoma cell line FO-1, it was found that the lack of expression of HLA class 1 antigens was the result of a defect in the B2M gene: a deletion of the first exon of the 5-prime flanking region and of a segment of the first intron. Analyses using single-strand conformation polymorphism (SSCP) analysis to screen a series of 37 established colorectal cell lines, 22 fresh tumor samples, and 22 normal DNA samples for mutations in the B2M gene, found mutations in 6 of 7 colorectal cell lines and 1 of 22 fresh tumors, whereas no mutations were detected in the normal DNA samples. Sequencing of these mutations showed that an 8-bp CT repeat in the leader peptide sequence was particularly variable, since 3 of the cell lines and 1 fresh tumor sample had deletions in this region. In 2 related colorectal cell lines, DLD-1 and HCT-15, 2 similar mutations were identified. Expression of beta-2-microglobulin was examined using a series of monoclonal antibodies in an ELISA system and reduced expression was correlated with a mutation in 1 allele of the B2M gene, whereas loss of expression was seen in instances where a line was homozygous for a mutation or heterozygous for 2 mutations.

The present invention provides, in part, a method to detect expression level changes in tumor-associated genes, such as changes in B2M gene expression levels, in brush cytology samples. In one aspect of the invention, the nucleic acid expression level, e.g., increased expression, of the B2M gene is indicative of a likelihood that a subject has squamous cell carcinoma (FIG. 1). In another aspect of the invention, protein or nucleic acid expression level, e.g., increased expression, of B2M is compared to a standard and is indicative of a likelihood that the subject has squamous cell carcinoma (FIG. 4). In yet another aspect of the invention, protein or nucleic acid expression levels, e.g., increased expression, of the B2M gene compared to a standard is indicative of a likelihood that the subject has a precancerous squamous cell disorder (FIG. 5). In one aspect of the invention, nucleic acid expression level of B2M is assayed over time, at repeated intervals, and expression level, e.g., increased expression, of the B2M gene is indicative of progression of neoplasia (FIG. 3). In another aspect of the invention, nucleic acid expression level of B2M is compared to a standard and over-expression of B2M is indicative of a likelihood that the subject has a precancerous squamous cell disorder (FIG. 2). In yet another aspect of the invention, a kit is provided for assessing the presence of cancer in a sample comprising a pair of primers that specifically hybridize to at least one non-degraded B2M nucleic acid sequence and reagents for real-time PCR.

II. Cytochrome P450 Proteins

Cytochrome p450 proteins are a large and diverse family of hemoproteins and monooxygenases which catalyze reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids and are responsible for the phase 1 metabolism of a wide range of structurally diverse substrates by inserting 1 atom of atmospheric oxygen into the substrate molecule, thereby creating a new functional group (e.g., -OH, -NH2, -COOH). Some well known family members include: cytochrome p450 and cytochrome p450 1A1 (CYP1A1). While less studied and more recently discovered, cytochrome p450 1B1 (CYP1B1) (NM_000104) (SEQ ID.: 2) also belongs to the cytochrome p450 superfamily of proteins. CYP1B1 was originally identified in 1994 through its homology to other identified family members, such as CYP1A1 with 44% identity. Despite the similarity, the two enzymes have very different catalytic efficiencies and metabolites when incubated with common substrates. CYP1B1 has also been found to be regulated by the aryl hydrocarbon receptor, a ligand activated transcription factor, and is expressed in many normal human tissues.

Recently CYP1B1 has been shown to be important in fetal development, with mutations linked to a form of primary congenital glaucoma. Screening for the presence of coding sequence changes in the CYP1B1 gene identified 3 different truncating mutations: a 13-bp deletion found in 1 consanguineous and 1 nonconsanguineous family; a single cytosine insertion observed in another 2 consanguineous families; and a large deletion found in an additional consanguineous family. In addition, a G-to-C transversion at nucleotide 1640 of the CYP1B1 coding sequence was found that caused a val432-to-leu amino acid substitution. This change created an EcoR57 restriction site, thus providing a rapid screening method. Heterozygosity for the val432-to-leu change was found in 51.4% of 70 normal individuals. This amino acid change was not in that part of CYP1B1 that represented conserved sequences, and both valine and leucine are neutral and hydrophobic. Their very similar aliphatic side groups differ by a single -CH2 group. Therefore, this change appeared to represent a common amino acid polymorphism that is not related to the primary congenital glaucoma phenotype. However the finding was not unexpected, as a link between members of this superfamily and the processes of growth and differentiation had been postulated previously. They speculated that CYP1B1 participates in the metabolism of an as-yet-unknown biologically active molecule that is a participant in eye development.

The present invention generally provides a method that detects tumor-associated changes in the expression level in genes, such as changes in CYP1B1 expression levels, in brush cytology samples. In one aspect of the invention, the nucleic acid expression level, e.g., decreased expression, of the CYP1B1 gene is indicative of a likelihood that a subject has squamous cell carcinoma (FIG. 1). In another aspect of the invention, protein or nucleic acid expression level, e.g., decreased expression, of CYP1B1 is compared to a standard and is indicative of a likelihood that the subject has squamous cell carcinoma (FIG. 4). In yet another aspect of the invention, protein or nucleic acid expression levels, e.g., increased expression, of the CYP1B1 gene compared to a standard is indicative of a likelihood that the subject has a precancerous squamous cell disorder (FIG. 5). In one aspect of the invention, nucleic acid expression level of CYP1B1 can be monitored over time, at repeated intervals, and the expression level, e.g., decreased expression, of the CYP1B1 gene is indicative of progression of squamous cell neoplasia (FIG. 3). In another aspect of the invention, nucleic acid expression level of CYP1B1 is compared to a standard and over-expression of CYP1B1 is indicative of a likelihood that the subject has a precancerous squamous cell disorder (FIG. 2). In yet another aspect of the invention, a kit is provided for assessing the presence of cancer in a sample comprising a pair of primers that specifically hybridize to at least one non-degraded CYP1B1 nucleic acid sequence and reagents for real-time RT-PCR.

Another aspect of the invention relates to inhibiting CYP1B1, and other cytochrome p450 family members, such as CYP1A1, as a method to inhibit carcinogenesis. Bioactive agents have been characterized to inhibit cytochrome p450-metabolism, and related family members-metabolism, of certain medications leading to increased bioavailability. Many of these bioactive agents are naturally occurring and can be found in grapefruit juice and other fruit juices. Some examples can include, but are not limited to, bergamottin, dihydroxybergamottin, geraniol and resveratrol (a phytoalexin). In another aspect of the invention, administering inhibitors of cytochrome p450 can be useful in treating or inhibiting squamous cell neoplasia. The inhibitors can be a bioactive agent that inhibits cytochrome p450 proteins and at least one of cytochrome p450 1B1 (CYP1B1), cytochrome p450 1A1 (CYP1A1) and combinations thereof.

III. Predictive Medicine

The present invention pertains to the field of predictive medicine in which diagnostic assays, prognostic assays, pharmacogenetics and monitoring clinical trials are used for prognostic (predictive) purposes to thereby detect a precancerous, cancerous or progression of a squamous cell cancer. Accordingly, one aspect of the present invention relates to diagnostic assays for detecting gene expression of nucleic acid and/or protein, in the context of a sample (e.g., brush cytology sample) to thereby detect the likelihood that a subject has a precancerous squamous cell disorder, has squamous cell carcinoma, or to monitor the progression of a squamous cell neoplasia, associated with increased or decreased nucleic acid and/or protein expression.

1. Harvesting the Sample

In one aspect of the invention, the sample can be a biopsy sample or a small number of cells or a tissue sample removed for processing. Common examples of biopsy methods can include, but are not limited to, brush cytology, core needle biopsy, surgical biopsy, punch biopsy, shave biopsy, incisional/excisional biopsy and curettage biopsy.

A brush cytology method can utilize a brush to obtain a complete transepithelial specimen with cellular representation from each of the three layers: the basal, intermediate, and superficial layers. Unlike some cytology instruments, which collect only exfoliated superficial cells, the brush cytology sample penetrates to the basement membrane, removing tissue from all three epithelial layers of the mucosa. The brush cytology can be performed with or without topical or local anesthetic. The brush cytology instrument or brush can have one or two cutting surfaces. Brushes with one surface can comprise a rod with perpendicular bristles. Brushes with two surfaces can comprise a flat end of the brush and a circular border of the brush. Either surface can be used to obtain the specimen.

Brush cytology samples can be utilized to routinely detect precancerous disorders and carcinomas. The diagnosis of a cancer can be, accordingly, made when a lesion is suspicious enough that it causes a health practitioner or other person skilled in the art to refer the lesion for further analysis. Thus, the brush cytology can be a method of detecting a precancerous squamous cell disorder, which can prevent the cancer from developing further, and it can be a method of identifying unsuspected cancers at early and treatable stages.

The brush cytology sample can provide a health practitioner or other person skilled in the art with a diagnostic screening test. In one aspect of the invention, a brush cytology sample can be obtained. Prior to obtaining the sample, it is preferable to rinse the subject's mouth with physiologic saline (pH 7.4) to remove any foreign debris that can be collected during the sample harvest. The mouth rinse can be saline solution or any commercially available mouth wash. Firm pressure with a brush can be applied to the area to be sampled. In some embodiments, a brush can be rotated in at least 20 brush strokes, where a single brush stroke is a forward to backward/backward to forward, a side to side or circular movement to obtain the sample. In some other embodiments, a first brush can be rotated in two to five brush strokes, to prime the surface, then the first brush is discarded. Then a second brush can be rotated in the same location in at least 20 brush strokes to obtain the sample. Little to no bleeding should result after the sample harvest. In another embodiment, the brush cytology sample can comprise squamous cells.

2. Diagnostic Assays

An exemplary method for detecting the presence or absence of nucleic acid or protein of the invention in a biological sample involves obtaining a sample from a subject, e.g., brush cytology sample, assaying the expression level (e.g., mRNA, cDNA or protein) of genes (e.g., B2M and CYP1B1) and comparing the expression levels to a standard to detect the likelihood the subject has a precancerous squamous cell disorder, squamous cell carcinoma or to monitor the progression of a neoplasia. A preferred method for detecting expression level of messenger ribonucleic acid (mRNA) or complementary deoxyribonucleic acid (cDNA) can use amplification and quantification of specific nucleic acids. Such polymerase chain reaction (PCR) methods can be referred to as: quantitative PCR (q-PCR), real-time PCR (q-PCR) and quantitative real-time PCR, see also U.S. Pat. No. 6,171,785, which can be modified and adapted for use by methods known to those of ordinary skill in the art.

Primers based on the nucleotide sequence of the genes of the invention can be used to detect transcripts corresponding to the gene(s) of the invention. In some embodiments, a primer pair can be designed by utilizing primer design software, such as GenScript, Primer3, PRIDE and Primer Express. Commercial primers are also available for purchase corresponding to multiple locations throughout the gene. In an exemplary embodiment, the primers can be complementary to an mRNA sequence of at least 15 bases found at least 500 basepairs and preferably at least 1000 basepairs from the encoded 3′ ends of the transcripts, corresponding to the transcriptional start site. By specifying at least 500 basepairs and preferably at least 1000 basepairs from the encoded 3′ ends of the transcripts for amplification, the q-PCR can be biased toward detecting the expression levels of non-degraded mRNA without interference of degraded mRNA that can be extracted from dead cells or cells undergoing apoptosis.
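The distance criterion for primer placement can be checked with a short calculation. The sketch below is illustrative only: the function names, the 1-based coordinate convention counted from the 5′ end, and the tier labels are assumptions, and the transcript lengths used with it would be hypothetical values supplied by the user rather than lengths taken from the sequences of the invention.

```python
def distance_from_3prime_end(transcript_length, amplicon_end):
    """Number of bases between the amplicon and the encoded 3' end.

    'amplicon_end' is the 1-based position, counted from the 5' end, of the
    last transcript base covered by the amplicon (assumed convention).
    """
    if not 1 <= amplicon_end <= transcript_length:
        raise ValueError("amplicon_end must lie within the transcript")
    return transcript_length - amplicon_end


def amplicon_tier(transcript_length, amplicon_end):
    """Grade an amplicon against the 500 and 1000 basepair criteria."""
    distance = distance_from_3prime_end(transcript_length, amplicon_end)
    if distance >= 1000:
        return "preferred"
    if distance >= 500:
        return "acceptable"
    return "too close to 3' end"
```

For a hypothetical 3000-base transcript, an amplicon ending at position 1500 would be graded "preferred", whereas one ending at position 2900 would fall within 500 bases of the 3′ end and be rejected.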

Another embodiment for detecting RNA or DNA corresponding to a gene or protein of the invention can be with the use of a labeled nucleic acid probe capable of hybridizing to a mRNA or cDNA of the invention. A wide variety of conventional techniques are available, including mass spectrometry, chromatographic separations, 2-D gel separations, binding assays (e.g., immunoassays), competitive inhibition assays, one- and two-dimensional gels and sandwiched ELISA. Typical methodologies for RNA detection include RNA extraction from a cell or tissue sample, followed by hybridization of a labeled probe, (e.g., a complementary polynucleotide) specific for the target RNA to the extracted RNA, and detection of the probe (e.g., Northern blotting), direct sequencing, gel electrophoresis, column chromatography, and quantitative PCR.

The term “sample” is intended to include tissues, cells and biological samples isolated from a subject (e.g., brush cytology sample), as well as tissues, cells and fluids present within a subject. That is, the detection method of the invention can be used to detect mRNA, protein, or cDNA in a sample in vitro as well as mRNA or protein in vivo. For example, in vitro techniques for detection of mRNA can include PCR, q-PCR, northern hybridizations and in situ hybridizations. In vitro techniques for detection of protein can include enzyme linked immunosorbent assays (ELISAs), western blots, immunoprecipitations and immunofluorescence. In vitro techniques for detection of cDNA can include Southern hybridizations, PCR, q-PCR. Furthermore, in vivo techniques for detection of protein can include introducing into a subject a labeled antibody. For example, the antibody can be labeled with a radioactive label whose presence and location in a subject can be detected by standard imaging techniques.

In one aspect of the invention, methods for detecting the likelihood that a subject has squamous cell carcinoma can involve obtaining a sample (e.g., brush cytology sample) from a subject, extracting nucleic acids (e.g., mRNA) from the sample or generating cDNA from mRNA, and assaying the nucleic acids for expression level of non-degraded nucleic acid sequences coding for production of beta-2 microglobulin (B2M) and cytochrome p450 1B1 (CYP1B1), wherein over-expression of the B2M gene compared to a standard, together with under-expression of the CYP1B1 gene compared to a second standard, is indicative of a likelihood that the subject has squamous cell carcinoma. Examples of a standard can be, but are not limited to, a non-cancer cells sample, brush cytology sample from a control subject and normal cells.

In another aspect of the invention, the methods can involve obtaining a control sample (e.g., non-cancer cells sample) from a subject, extracting nucleic acids (e.g., mRNA) from the sample or generating cDNA from mRNA, and assaying the nucleic acids for expression level of non-degraded nucleic acid sequences coding for production of beta-2 microglobulin (B2M) and cytochrome p450 1B1 (CYP1B1), wherein over-expression of the B2M gene compared to a standard together with over-expression of the CYP1B1 gene compared to a second standard is indicative of a likelihood that the subject has a precancerous squamous cell disorder.
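The two expression patterns described in the preceding aspects can be summarized as a simple lookup. This is a minimal sketch for illustration, assuming that each gene has already been called "over", "under" or "unchanged" relative to its own standard; the function name `interpret_pattern` and the returned strings are hypothetical, and the output is an indication of likelihood rather than a diagnosis.

```python
VALID_CALLS = {"over", "under", "unchanged"}


def interpret_pattern(b2m_call, cyp1b1_call):
    """Map B2M and CYP1B1 expression calls to the pattern they may indicate.

    Each call is relative to that gene's standard (assumed to be computed
    elsewhere, e.g., from q-PCR results).
    """
    if b2m_call not in VALID_CALLS or cyp1b1_call not in VALID_CALLS:
        raise ValueError("calls must be 'over', 'under' or 'unchanged'")
    if b2m_call == "over" and cyp1b1_call == "under":
        return "likelihood of squamous cell carcinoma"
    if b2m_call == "over" and cyp1b1_call == "over":
        return "likelihood of precancerous squamous cell disorder"
    # Any other combination is not addressed by the two patterns above.
    return "no described pattern"
```

Keeping the interpretation separate from the quantification step makes it possible to change how the over- and under-expression calls are derived without altering the decision rule.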

The invention also encompasses kits for detecting the presence of expression of the genes, B2M and CYP1B1, in a sample. For example, the kit can comprise a pair of primers which specifically hybridize to at least one non-degraded nucleic acid sequence coding for production of beta-2 microglobulin (B2M) gene or a cytochrome p450 1B1 (CYP1B1) gene and reagents for real-time polymerase chain reaction (q-PCR). The kit can further comprise a brush to obtain a brush cytology sample and nucleic acid extraction reagents. Furthermore, the kit can comprise instructions for using the kit to detect protein or nucleic acids.

In certain embodiments, detection of the expression levels can involve the use of a probe/primer in a polymerase chain reaction (PCR) (see, e.g., U.S. Pat. Nos. 4,683,195 and 4,683,202), such as PCR or q-PCR. This method can include the steps of collecting a sample of cells from a subject (such as a brush cytology sample), extracting nucleic acids (e.g., cDNA generated from mRNA, mRNA or both) from the cells of the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to a gene under conditions such that hybridization and amplification of the gene (if present) occurs, and detecting the expression level of an amplification product, and comparing the expression level to a standard.
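The final step, comparing the expression level of an amplification product to a standard, is not tied to a particular calculation in this description. One widely used option for real-time PCR data is the comparative threshold cycle (2^-ΔΔCt) calculation, sketched below as an assumption rather than as the method of the invention. The sketch presumes approximately 100% amplification efficiency and a reference (housekeeping) gene measured in both the sample and the standard; all parameter names are hypothetical.

```python
def relative_expression(ct_target_sample, ct_reference_sample,
                        ct_target_standard, ct_reference_standard):
    """Fold change of a target gene in a sample relative to a standard.

    Uses the comparative Ct calculation: each target Ct is normalized to a
    reference gene, and the sample is then compared to the standard.
    Assumes the amount of product doubles with every cycle.
    """
    delta_ct_sample = ct_target_sample - ct_reference_sample
    delta_ct_standard = ct_target_standard - ct_reference_standard
    delta_delta_ct = delta_ct_sample - delta_ct_standard
    return 2.0 ** (-delta_delta_ct)
```

A result above 1 suggests higher expression in the sample than in the standard, and a result below 1 suggests lower expression; the threshold at which a difference is considered meaningful is a separate choice.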

In other embodiments, intensity assessment in electrophoretic mobility can be used to identify expression level of genes or genes encoding a protein of the invention. For example, amplification of the cDNA generated from the mRNA can be performed and the reaction product can be measured and quantified by electrophoresis.

In one aspect of the invention, the results from assaying expression levels of tumor-associated genes, such as B2M and CYP1B1, can influence a treatment prescribed by a health care practitioner or other person of known skill in the art. Based on the analyzed expression levels of the tumor-associated genes, such as B2M and CYP1B1, additional assessments can be made to determine a treatment. The type of treatment options can be determined by those skilled in cancers. In some embodiments, repeated sampling can be done to monitor the progression of the squamous cell neoplasia over time when aberrant expression of tumor-associated genes is detected in initial assays. Some treatments can also include administering bioactive agents that act as inhibitors of cytochrome p450 family members. These inhibitors can potentially inhibit the overexpression of cytochrome p450 1B1 that is demonstrated in FIGS. 7C and 8C. Inhibitors can also be administered to treat a squamous cell neoplasia by inhibiting cytochrome p450.

3. Monitoring of the Progression of Neoplasia

Monitoring the cancer, e.g., squamous cell neoplasia, in a subject over time by assessing the expression of genes (e.g., B2M and CYP1B1) can monitor the progression of the squamous cell neoplasia. For example, the progression of squamous cell neoplasia over time can comprise an increase or decrease of gene expression levels or protein levels indicative of progression or inhibition of the neoplasia. Alternatively, the effectiveness of a treatment or the influence of agents (e.g., drugs) on the squamous cell neoplasia, can increase or decrease gene expression levels or protein levels. In such clinical trials, the expression levels of a gene or genes can be used as a “read out” of the progression or inhibition of the neoplasia.

For example, and not by way of limitation, genes, including genes of the invention and proteins encoded by the genes, that are altered by treatment with an agent (e.g., compound, drug or small molecule) can be identified. Thus, to study the effect of agents on gene-associated disorders (e.g., squamous cell carcinoma), for example, in a clinical trial, samples can be obtained and nucleic acids or proteins can be extracted and assayed for expression levels. The expression levels can be assayed by q-PCR, as described herein, or alternatively by measuring the amount of nucleic acid or protein produced, by one of the methods as described herein. In this way, the expression levels can be indicative of the physiological response of the neoplasia to the agent. Accordingly, the expression levels can be assayed before, and at various points during treatment with the agent.

In a preferred embodiment, the present invention provides a method for monitoring squamous cell neoplasia in a human subject over time including the steps of (i) obtaining a brush cytology sample from a subject at a first time; (ii) extracting nucleic acids from cells in the sample; (iii) assaying said nucleic acids for the expression level of genes coding for the production of beta-2 microglobulin (B2M) and cytochrome p450 1B1 (CYP1B1); and (iv) repeating the steps of obtaining a sample, extracting nucleic acids and assaying for expression levels of B2M and CYP1B1 at a later time, wherein increased expression of the B2M gene at a later time or decreased expression of the CYP1B1 gene at a later time is indicative of progression of neoplasia.
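The comparison made in step (iv) can be sketched as follows. This is a simplified illustration that assumes each time point is summarized as a dictionary of expression levels keyed by gene name and measured on a common scale; the function name `indicates_progression` is hypothetical, and no minimum fold change is applied here although one could be added.

```python
def indicates_progression(first_levels, later_levels):
    """Compare two time points for the pattern associated with progression.

    Returns True when B2M expression has increased or CYP1B1 expression has
    decreased at the later time, relative to the first time.
    """
    for gene in ("B2M", "CYP1B1"):
        if gene not in first_levels or gene not in later_levels:
            raise KeyError(f"missing expression level for {gene}")
    b2m_increased = later_levels["B2M"] > first_levels["B2M"]
    cyp1b1_decreased = later_levels["CYP1B1"] < first_levels["CYP1B1"]
    return b2m_increased or cyp1b1_decreased
```

For a series of more than two samples, the same comparison can be applied to each consecutive pair of time points.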

In another embodiment, squamous cell neoplasia in a human subject can be monitored over time in response to a treatment. A sample can be obtained, nucleic acids extracted from the sample, expression level of genes encoding for beta-2 microglobulin (B2M) and cytochrome p450 1B1 (CYP1B1) can be assayed, a treatment can be administered, wherein the treatment is a bioactive agent that inhibits cytochrome p450 proteins, sampling from the subject can be repeated over time and the expression level of B2M and CYP1B1 at a later time is indicative of the response to the treatment.

4. Kits for Detecting Cancer

The present invention also provides a kit that can be used in the above methods. A kit for assessing cancer in a sample of the present invention includes a means of detecting the expression levels of a beta-2 microglobulin (B2M) gene or a cytochrome p450 1B1 (CYP1B1) gene in a sample. The present kit for cancer can include reagents used to make a diagnosis of cancer. Also, the present kit for cancer can comprise components used in publicly known kits, except that it additionally includes a means of detecting the expression level of genes associated with cancer (e.g., B2M and CYP1B1). Further, with the use of the kit for cancer, it can be possible to diagnose a subject as having cancer. Examples of cancer include, but are not limited to, squamous cell carcinoma. The kit can also be used to monitor progression of squamous cell neoplasias, as described above.

Herein, examples of a means of detecting the presence of cancer by assaying expression levels of cancer associated genes can comprise:

(1) a pair of primers which specifically hybridize to at least one non-degraded nucleic acid sequence coding for production of beta-2 microglobulin (B2M) gene or a cytochrome p450 1B1 (CYP1B1) gene; and

(2) reagents for real-time polymerase chain reaction (q-PCR).

In additional embodiments the kit can comprise additional tools, reagents or instruction manuals. For example, the kit can comprise reagents for cDNA synthesis and a brush for obtaining a brush oral cytology sample from a subject. Also, the kit can comprise a nucleic acid extraction reagent to isolate nucleic acids from a sample.

In one embodiment, the kit can be a diagnostic kit for use in testing a sample. The kit can comprise one or more suitable pairs of primers for simultaneous or individual reverse transcription of different genes associated with cancer, such as B2M and CYP1B1, and optionally an appropriate calibrator mRNA in a single cDNA-synthesis reaction, as well as standards or controls for q-PCR and/or standards or controls for B2M and CYP1B1 expression levels. The kit of the invention can be particularly useful for carrying out a variety of highly sensitive real-time PCRs (q-PCRs), thus allowing the quantification of expression levels of the tumor-associated genes, such as B2M and CYP1B1. For example, such a kit can include reagents for detecting expression levels of cancer associated genes, such as B2M and CYP1B1 (for example, primers and q-PCR reagents).

In another embodiment of the present invention, the kit can contain instructions and reagents to simultaneously prime the reverse transcription of mRNA from more than one tumor associated gene in a single cDNA-synthesis reaction. Simultaneous quantification of genes by highly sensitive (reverse transcriptase PCR, RT-PCR) of the invention can reliably convert mRNA to cDNA by reverse transcription with reproducible efficiency.

In yet another embodiment, the kit can be used as a screening kit for the presence of cancer in a sample or a series of samples. The kit can further be used in a method for monitoring the progression of a squamous cell neoplasia.

In one aspect of the invention, the kit can be used to determine expression levels of tumor-associated genes, such as B2M and CYP1B1, which can influence a treatment prescribed by a health care practitioner or other person skilled in the art. The type of treatment can be determined by those skilled in cancers, based on the results from the kit which analyzes expression levels of the tumor-associated genes, such as B2M and CYP1B1. In another embodiment, additional kits can be used over time to monitor the progression of the squamous cell neoplasia when over-expression of B2M and CYP1B1 is detected. In another embodiment, multiple kits can be used over time to monitor the progression or inhibition of squamous cell neoplasia in response to a treatment with a bioactive agent by monitoring expression levels of B2M and CYP1B1.
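The repeated-measurement use of the kit can be sketched in a few lines. The following is a minimal illustration only, assuming that normalized B2M and CYP1B1 expression values are already available for two sampling times. It applies the criterion recited in the monitoring method claimed below (increased B2M or decreased CYP1B1 at the later time). The function name `progression_flags`, the dictionary layout, and the plain greater-than/less-than comparison are assumptions made for illustration; a practical assay would need cut-offs that account for measurement variability.

```python
from typing import Dict


def progression_flags(earlier: Dict[str, float], later: Dict[str, float]) -> Dict[str, bool]:
    """Compare normalized B2M and CYP1B1 levels from two sampling times.

    Both arguments map a gene symbol to a normalized expression value.
    The bare greater-than / less-than comparison is an illustrative
    assumption; it ignores assay variability.
    """
    b2m_increased = later["B2M"] > earlier["B2M"]
    cyp1b1_decreased = later["CYP1B1"] < earlier["CYP1B1"]
    return {
        "B2M_increased": b2m_increased,
        "CYP1B1_decreased": cyp1b1_decreased,
        # Either change alone is treated as a sign of possible progression.
        "possible_progression": b2m_increased or cyp1b1_decreased,
    }


# Hypothetical values for one subject at two visits (not measured data).
first_visit = {"B2M": 1.1, "CYP1B1": 12.0}
later_visit = {"B2M": 2.4, "CYP1B1": 5.0}
flags = progression_flags(first_visit, later_visit)
```

Here `flags["possible_progression"]` is true because B2M rose and CYP1B1 fell between the two hypothetical visits.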

IV. Isolated Nucleic Acid and Proteins and Detection Methods

One aspect of the invention pertains to extracting nucleic acid molecules that either themselves are the nucleic acid sequences of interest (e.g., mRNA) of the invention, or which encode the polypeptide of the invention, or fragments thereof. As used herein, the term “nucleic acid molecule” is intended to include DNA molecules (e.g., cDNA or genomic DNA) and RNA molecules (e.g., mRNA) and analogs of the DNA or RNA generated using nucleotide analogs. The nucleic acid molecule can be single-stranded or double-stranded.

The term “extracted nucleic acid molecule” includes nucleic acid molecules which are separated from other molecules, such as other nucleic acid molecules or cellular debris which can be present within or associated with cells. For example, with regard to RNA, the term “isolated” includes RNA molecules which are separated from the other nucleic acids which are normally associated with RNA, such as DNA. Moreover, an “extracted” nucleic acid molecule, such as a cDNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized. Nucleic acid molecules can be isolated from a cellular sample through means known by those skilled in the art, such as through cell lysis and precipitation and/or use of commercial reagents specialized in nucleic acid extraction.

A nucleic acid molecule of the present invention, e.g., a nucleic acid molecule having the nucleotide sequence of the gene or a portion thereof, can be isolated using standard molecular biology techniques and the sequence information provided herein. Using all or a portion of the nucleic acid sequence of the gene as a hybridization probe, a gene of the invention or a nucleic acid molecule encoding a polypeptide of the invention can be isolated using standard hybridization and cloning techniques (e.g., as described in Sambrook, J., Fritsch, E. F., and Maniatis, T. Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989).

A nucleic acid of the invention can be amplified and quantified using mRNA, or cDNA generated from mRNA, as a template and appropriate oligonucleotide primers according to standard PCR amplification techniques and quantitative methods, such as q-PCR. The nucleic acids of the invention, moreover, can comprise non-degraded nucleic acid sequences coding for production of beta-2 microglobulin (B2M) and cytochrome p450 1B1 (CYP1B1). In a more refined approach, cDNA copies of these mRNAs can be made using reverse transcriptase by methods known to those skilled in the art. A probe/primer can be generated to a specific portion of the genes to assay non-degraded mRNA, such as a portion at least 500 basepairs and preferably at least 1000 basepairs from the encoded 3′ ends of the transcripts. The primers can be generated or purchased, as described above, such that they hybridize to at least about 10 to 12, preferably at least 15, bases found at least 500 basepairs and preferably at least 1000 basepairs from the encoded 3′ ends of the transcripts of the invention.

In another embodiment, extracted nucleic acids of the invention can be at least 50, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000 or more nucleotides in length and can hybridize to a nucleic acid molecule corresponding to a nucleotide sequence of a gene.

In one embodiment, proteins can be extracted from cells or tissue sources by an appropriate purification scheme using standard protein purification techniques. An “extracted” or “purified” protein or portion thereof is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the protein is derived, or substantially free from chemical precursors or other chemicals when chemically extracted. The language “substantially free of cellular material” includes preparations of protein in which the protein is separated from cellular components of the cells from which it is isolated or recombinantly produced. In one embodiment, the language “substantially free of cellular material” can include preparations of protein having less than about 30% (by dry weight) of other proteins (also referred to herein as “contaminating proteins”), more preferably less than about 20% of other proteins, still more preferably less than about 10% of other proteins, and most preferably less than about 5% of other proteins. When the protein or portion thereof is chemically extracted, it can also be substantially free of the chemicals used for extraction, i.e., the chemicals represent less than about 20%, more preferably less than about 10%, and most preferably less than about 5% of the volume of the protein preparation.

As used herein, a “portion” of the protein includes a fragment of the protein comprising amino acid sequences sufficiently homologous to or derived from the amino acid sequence of the protein, which include fewer amino acids than the full length proteins. Typically, portions of the protein can comprise a domain or motif with at least one activity of the protein. A portion of the protein can be a polypeptide which is, for example, 10, 25, 50, 100, 200 or more amino acids in length. Portions of the protein can be used as targets for developing agents which modulate expression levels of the protein.

The proteins or nucleic acid sequences of the invention can be detected by any method known to those of skill in the art. A wide variety of conventional techniques are available, including mass spectrometry, chromatographic separations, 2-D gel separations, binding assays (e.g., immunoassays), competitive inhibition assays, and so on. Any effective method in the art for measuring the presence/absence, level or activity of a protein or nucleic acid sequence is included in the invention. It is within the ability of one of ordinary skill in the art to determine which method would be most appropriate for measuring a specific protein or nucleic acid sequence. Thus, for example, an ELISA assay may be best suited for use in a physician's office, while a measurement requiring more sophisticated instrumentation may be best suited for use in a clinical laboratory. Regardless of the method selected, it is important that the measurements be reproducible.

Quantification can be based on derivatization in combination with isotopic labeling, referred to as isotope coded affinity tags (“ICAT”). In this and other related methods, a specific amino acid in two samples is differentially and isotopically labeled and subsequently separated from peptide background by solid phase capture, wash and release. The intensities of the molecules from the two sources with different isotopic labels can then be accurately quantified with respect to one another. In addition, one- and two-dimensional gels have been used to separate proteins and quantify gel spots by silver staining, fluorescence or radioactive labeling. These differently stained spots have been detected using mass spectrometry, and identified by tandem mass spectrometry techniques.

In other preferred embodiments, the level of the proteins or nucleic acid sequences can be determined using a standard immunoassay, such as a sandwich ELISA using matched antibody pairs and chemiluminescent detection. Commercially available or custom monoclonal or polyclonal antibodies are typically used. However, the assay can be adapted for use with other reagents that specifically bind to the molecule. Standard protocols and data analysis are used to determine the marker concentrations from the assay data.

One embodiment for detecting RNA or DNA corresponding to a gene or protein of the invention can be with the use of a labeled nucleic acid probe capable of hybridizing to an mRNA or cDNA of the invention. Suitable probes for use in the diagnostic assays of the invention are described herein. A preferred agent for detecting protein is an antibody capable of binding to protein, preferably an antibody with a detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)₂) can be used.

The term “labeled”, with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with another reagent that is directly labeled. Examples of indirect labeling can include detection of a primary antibody using a fluorescently labeled secondary antibody and end-labeling of a DNA probe with biotin such that it can be detected with fluorescently labeled streptavidin. The term “sample” is intended to include tissues, cells and biological samples isolated from a subject (e.g., brush cytology sample), as well as tissues, cells and fluids present within a subject. That is, the detection method of the invention can be used to detect mRNA, protein, or cDNA in a sample in vitro as well as mRNA or protein in vivo. For example, in vitro techniques for detection of mRNA can include PCR, q-PCR, northern hybridizations and in situ hybridizations. In vitro techniques for detection of protein can include enzyme linked immunosorbent assays (ELISAs), western blots, immunoprecipitations and immunofluorescence. In vitro techniques for detection of cDNA can include Southern hybridizations, PCR, and q-PCR (e.g., as described in L. Cseke, et al., Handbook of Molecular and Cellular Methods in Biology and Medicine, 2nd Ed., CRC Press, 2004). Furthermore, in vivo techniques for detection of protein can include introducing into a subject a labeled antibody. For example, the antibody can be labeled with a radioactive label whose presence and location in a subject can be detected by standard imaging techniques.

Measurement of the relative amount of an RNA or protein molecule of the invention can be by any method known in the art (see, e.g., Sambrook, J., Fritsch, E. F., and Maniatis, T. Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989; and Current Protocols in Molecular Biology, eds. Ausubel et al. John Wiley & Sons: 1992). Typical methodologies for RNA detection include RNA extraction from a cell or tissue sample, followed by hybridization of a labeled probe (e.g., a complementary polynucleotide) specific for the target RNA to the extracted RNA, and detection of the probe (e.g., Northern blotting). Typical methodologies for protein detection include protein extraction from a cell or tissue sample, followed by binding of a labeled probe (e.g., an antibody) specific for the target protein to the protein sample, and detection of the probe. The label group can be a radioisotope, a fluorescent compound, an enzyme, or an enzyme co-factor. Detection of specific proteins and polynucleotides may also be assessed by gel electrophoresis, column chromatography, direct sequencing, or quantitative PCR (in the case of polynucleotides), among many other techniques well known to those skilled in the art.

Detection of the presence or number of copies of all or a part of a gene of the invention may be performed using any method known in the art. Typically, it is convenient to assess the presence and/or quantity of a DNA or cDNA by Southern analysis, in which total DNA from a cell or tissue sample is extracted and hybridized with a labeled probe (e.g., a complementary DNA molecule), and the probe is detected. The label group can be a radioisotope, a fluorescent compound, an enzyme, or an enzyme co-factor. Other useful methods of DNA detection and/or quantification include direct sequencing, gel electrophoresis, column chromatography, and quantitative PCR, as is known by one skilled in the art.

The proteins or nucleic acid sequences of the invention can be detected by any method known to those of skill in the art. Primers based on the nucleotide sequence of the genes of the invention can be used to detect transcripts corresponding to the gene(s) of the invention. In some embodiments, a primer pair can be designed by utilizing primer design software, such as GenScript, Primer3, PRIDE and Primer Express. Commercial primers are also available for purchase corresponding to multiple locations throughout the gene. In an exemplary embodiment, the primers can be complementary to an mRNA sequence of at least 15 bases found at least 500 basepairs and preferably at least 1000 basepairs from the encoded 3′ ends of the transcripts, corresponding to the transcriptional start site. By specifying at least 500 basepairs and preferably at least 1000 basepairs from the encoded 3′ ends of the transcripts for amplification, the q-PCR can be biased toward detecting the expression levels of non-degraded mRNA without interference from degraded mRNA that can be extracted from dead cells or cells undergoing apoptosis.
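The positional criterion for primer target sites can be checked computationally once a transcript sequence is in hand. The sketch below is illustrative only: it assumes the transcript and the target site are both given as sense-strand strings, and it measures distance as the number of bases lying 3′ of the target site. The function names and the toy sequence are hypothetical and do not correspond to any B2M or CYP1B1 sequence.

```python
def bases_from_3prime_end(transcript: str, target_site: str) -> int:
    """Return the number of bases lying 3' of the first match of target_site."""
    start = transcript.find(target_site)
    if start == -1:
        raise ValueError("target site not found in transcript")
    return len(transcript) - (start + len(target_site))


def site_meets_criteria(transcript: str, target_site: str,
                        min_site_length: int = 15,
                        min_distance: int = 500) -> bool:
    """Check the length and distance-from-3'-end criteria described above."""
    long_enough = len(target_site) >= min_site_length
    far_enough = bases_from_3prime_end(transcript, target_site) >= min_distance
    return long_enough and far_enough


# Toy sequence for illustration only: 100 bases, a 15-base site, then 600 bases.
toy_site = "GATTACAGATTACAG"
toy_transcript = "A" * 100 + toy_site + "C" * 600

passes_500 = site_meets_criteria(toy_transcript, toy_site)
passes_1000 = site_meets_criteria(toy_transcript, toy_site, min_distance=1000)
```

With this toy input the site lies 600 bases from the 3′ end, so it satisfies the 500-basepair criterion but not the preferred 1000-basepair criterion.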

EXAMPLES

This invention is further illustrated by the following examples which should not be construed as limiting. The following experiments were performed to demonstrate various aspects of the invention.

Example 1: Materials and Methods

Oral Carcinogenesis

Dibenzo[a,l]pyrene was applied orally at a level of 0.025 nm three times a week for 33 weeks to produce floor of the mouth and lateral border of tongue tumors in Golden Syrian hamsters (Mesocricetus auratus). Five of 12 animals developed oral squamous cell carcinoma (OSCC) detectable by gross inspection. These were later verified histologically. The average cross-sectional area of the lesions in these five was 3.2 mm². The first samples were taken from these five hamsters 1 month after the end of the carcinogen exposure (week 37). This was to ensure that the observed gene expression changes were due to long-term changes in the tissue and not directly due to the presence of 0.0025 nM dibenzo[a,l]pyrene. Eight hamsters treated identically but never exposed to dibenzo[a,l]pyrene were used as the source of control tissue. All procedures were carried out within the guidelines of the Animal Research Committee at the University of Illinois at Chicago.

Cell and Tissue Acquisition

For brush cytology, a Cytosoft brush (Cytology Brush, Medical Packaging Corp., Camarillo, Calif., USA) was used to harvest oral keratinocytes from the mucosa, between 2:00 and 3:30 PM, on three consecutive weeks (weeks 37, 38, and 39). Twenty back and forth brushing motions were used. No trauma to the mucosa was noted. Brush oral cytology was applied to oral carcinoma and non-oral carcinoma sites. In the 40th week, normal and tumor-bearing mucosa was surgically removed following asphyxiation with bottled carbon dioxide.

Histopathology to Identify Oral Cancer

Tissues were processed, embedded, and sectioned at 5 μm. Sections were stained with hematoxylin and eosin using an autostainer (Leica Microsystems, Bannockburn, Ill., USA) and evaluated using standard criteria.

Immunohistochemistry

Cells were fixed in 2.5% formalin overnight and then subjected to immunofluorescent staining using pancytokeratin-specific antibodies, clones AE-1 and AE-3 (ab961) (Abcam, Cambridge, UK), as directed. The Ventana HX system (Ventana, Yokohama, Japan) was used to perform the immunofluorescent staining according to the manufacturer's protocol with standard enzymatic antigen retrieval. Tissue was treated identically except that it was fixed in 10% formalin overnight and embedded in paraffin prior to sectioning.

RNA Extraction

Following brush cytology cell collection, the brush was immersed in Trizol (Invitrogen, Carlsbad, Calif., USA), vortexed and then frozen at −70° C. On thaw, the sample was vortexed and then subjected to standard RNA isolation, followed by DNase I treatment with the Aurum Total RNA Mini-kit as described by the manufacturer (Bio-Rad, Hercules, Calif., USA). cDNA synthesis was performed with one third of the total sample of RNA, using random hexamers and Superscript III RT enzyme (Invitrogen). A similar process was used for the isolation of RNA from tissue, except that mechanical homogenization in Trizol (Invitrogen) was required.

Quantitative Real-Time q-PCR

Quantitative real-time PCR (q-PCR) was carried out using the iCycler iQ (Bio-Rad) and SYBR Green fluorescence to detect double-stranded DNA. Values were normalized to the best controls: succinate dehydrogenase complex A (SDHA) and glyceraldehyde-3-phosphate dehydrogenase (GAPD) for brush cytology samples, and cyclophilin A (PPIA) and beta-actin (ACTB) for tissue and brush cytology samples together. The quality of the RNA was judged to be satisfactory based on the fact that q-PCR with PPIA primer sets with different product sizes (120, 150, and 182 nucleotides) all gave similar results. Negative controls omitted reverse transcriptase from the cDNA synthesis. Amplicon sizes for primer pair products were validated using standard q-PCR with agarose gel ethidium bromide visualization. The results are reported as mean values from 3 to 6 separate samples. All PCR runs included a reference cDNA to allow the comparison of expression levels of samples tested at different times. Primer sets used included: forward primer hamster B2M (3′ AGTTTGTACCCACTGCGACTGA 5′) (SEQ ID NO.: 3); reverse primer hamster B2M (3′ TGCTGCTGTGTGCATAGACTGA 5′) (SEQ ID NO.: 4); forward primer human B2M (3′ TGTGCTCGCGCTACTCTCTCTTT 5′) (SEQ ID NO.: 5); reverse primer human B2M (3′ ATGTCGGATGGATGAAACCCAGAC 5′) (SEQ ID NO.: 6); forward primer hamster CYP1B1 (3′ GAATCCATGCGCTTCTCCAGCTTT 5′) (SEQ ID NO.: 7); reverse primer hamster CYP1B1 (3′ TCCAGGAATCGGGCTGGATCAAAT 5′) (SEQ ID NO.: 8); forward primer human CYP1B1 (3′ GCCTCATTATGTCAACCAGGTCCA 5′) (SEQ ID NO.: 9); and reverse primer human CYP1B1 (3′ AAGCCAGGTAAACTCCAAGCACCT 5′) (SEQ ID NO.: 10).

Determination of Endogenous Controls for mRNA Levels of Brush Oral Cytology Harvested RNA

Direct analysis of RNA concentrations in brush cytology samples was impractical because of the low amounts (estimated at 20-200 ng), so the identification of reference genes to control for the mRNA levels was of great importance; see Table 1 (Sample 1 and 2 from Patient 1 and Sample A, B and C from Patient 2). Potential housekeeping genes for this purpose were identified based on their constant expression in many tissues or on consistent levels in normal and tumor tissue of the gastrointestinal tract, based on data contained at the SAGEmap site of the Cancer Genome Anatomy Project (http://www.ncbi.nlm.nih.gov/SAGE).

Of the candidates, cDNA sequences for four (ACTB, GAPD, PPIA, and SDHA) were available in the Syrian Golden Hamster database, and a fifth, GSTP1, was added based on our observation that its expression was similar on average in tumor and normal oral mucosa (see FIGS. 8A-8F). We determined the expression level of these genes in brush cytology samples from eight examples of normal tissue and 10 examples of tumor tissue. We used the NORMFINDER program to identify the optimal control(s). This program determined which gene(s) varied minimally in expression levels when compared to average expression of the other potential reference genes. For tumor and control brush cytology samples (as in FIGS. 7A-7F), the geometric mean of the SDHA and GAPD levels was identified as an optimal internal standard. Analogously, for tumor and control RNA from cytology and tissue biopsy samples together (as in FIGS. 8A-8F), the geometric mean of ACTB and PPIA levels was an optimal control.
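Once a pair of reference genes has been chosen, normalization to their geometric mean is a short calculation. The sketch below is a minimal illustration, assuming that each gene's level is available as a linear-scale relative quantity (not a raw threshold cycle). It does not reproduce the NORMFINDER selection step; the function names and numbers are hypothetical.

```python
import math
from typing import Sequence


def geometric_mean(values: Sequence[float]) -> float:
    """Geometric mean of strictly positive values."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be non-empty and strictly positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def normalize_to_references(target_quantity: float,
                            reference_quantities: Sequence[float]) -> float:
    """Divide a target gene quantity by the geometric mean of the references."""
    return target_quantity / geometric_mean(reference_quantities)


# Hypothetical linear-scale quantities for one brush cytology sample.
sdha_quantity = 2.0
gapd_quantity = 8.0
b2m_quantity = 12.0

normalized_b2m = normalize_to_references(b2m_quantity, [sdha_quantity, gapd_quantity])
```

In this made-up case the geometric mean of the two reference quantities is 4, so the normalized B2M value is 3. The geometric mean is less sensitive than the arithmetic mean to one reference gene being expressed at a much higher level than the other.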

TABLE 1 Proportion of undegraded human beta actin mRNA in different samples

              Sample 1   Sample 2   Sample A   Sample B   Sample C
Product-5′    .0072      .042       .016       1.8        1
Product-3′    .45        1.2        .186       1.8        3
5′/3′         .016       .034       .086       1          .33
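The 5′/3′ row of Table 1 follows from dividing the 5′ product level by the 3′ product level for each sample. The sketch below recomputes those ratios from the tabulated values as a consistency check. The dictionary layout and function name are illustrative choices, and the reading that a ratio closer to 1 reflects a larger proportion of undegraded transcript is inferred from the table title.

```python
from typing import Dict, Tuple

# (5' product level, 3' product level) as listed in Table 1.
table_1: Dict[str, Tuple[float, float]] = {
    "Sample 1": (0.0072, 0.45),
    "Sample 2": (0.042, 1.2),
    "Sample A": (0.016, 0.186),
    "Sample B": (1.8, 1.8),
    "Sample C": (1.0, 3.0),
}


def five_to_three_ratio(product_5prime: float, product_3prime: float) -> float:
    """Ratio of the 5' amplicon level to the 3' amplicon level."""
    if product_3prime <= 0:
        raise ValueError("3' product level must be positive")
    return product_5prime / product_3prime


ratios = {
    sample: five_to_three_ratio(p5, p3)
    for sample, (p5, p3) in table_1.items()
}

for sample, ratio in ratios.items():
    print(f"{sample}: 5'/3' = {ratio:.3f}")
```

The recomputed ratios agree with the tabulated 5′/3′ row to within rounding.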

Statistical Analysis

The data presented are mean ± SD unless otherwise stated. For statistical comparison of RNA levels between the control and tumor groups, the Student's t-test was used. Results were considered statistically significant if the two-tailed P-values were <0.05. Analysis of variance (ANOVA) was used for the determination of the intraclass correlation (ICC) for repeated tests on the same hamster (FIGS. 7A-7F).
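An ANOVA-based intraclass correlation can be computed in more than one way, and the text does not state which form was used. The sketch below therefore assumes the common one-way random-effects form, ICC = (MSB − MSW) / (MSB + (k − 1)·MSW), where MSB and MSW are the between-animal and within-animal mean squares and k is the number of repeated measurements per animal. The function name and input values are hypothetical.

```python
from typing import Sequence


def icc_oneway(measurements: Sequence[Sequence[float]]) -> float:
    """One-way random-effects ICC for a balanced design.

    measurements[i] holds the k repeated values for animal i.
    The choice of this ICC form is an assumption for illustration.
    """
    n = len(measurements)
    k = len(measurements[0])
    if n < 2 or k < 2 or any(len(row) != k for row in measurements):
        raise ValueError("need at least 2 animals with equal numbers (>= 2) of repeats")

    animal_means = [sum(row) / k for row in measurements]
    grand_mean = sum(animal_means) / n

    # Between-animal and within-animal mean squares from one-way ANOVA.
    msb = k * sum((m - grand_mean) ** 2 for m in animal_means) / (n - 1)
    msw = sum(
        (x - m) ** 2 for row, m in zip(measurements, animal_means) for x in row
    ) / (n * (k - 1))

    denominator = msb + (k - 1) * msw
    if denominator == 0:
        raise ValueError("ICC is undefined when all measurements are identical")
    return (msb - msw) / denominator


# Hypothetical weekly values (3 weeks) for 3 animals.
perfectly_repeatable = [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0], [3.0, 3.0, 3.0]]
not_repeatable = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]

icc_high = icc_oneway(perfectly_repeatable)
icc_low = icc_oneway(not_repeatable)
```

In the first made-up data set all variation lies between animals and the ICC is 1; in the second all variation lies within animals and the ICC takes its minimum value for three repeats, −0.5.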

Example 2: Experimental Results

Reliability of Quantitation of Brush Cytology Sample RNA

One month after the end of the dibenzo[a,l]pyrene exposure, brush cytology samples were harvested on three consecutive weeks from diseased and control unexposed hamsters (FIGS. 6A and 6B). RNA was purified and subjected to real-time q-PCR analysis (FIGS. 7A-7F). We used these two sources of cells (tumor epithelium and control mucosa) to increase the probability that specific RNA expression levels would vary among the different animals.

The bar graphs in FIGS. 7A-7F show the measured level for each RNA of interest and allow an analysis of the reliability of the methodology described here. In addition to the tumor-associated genes, expression of the endothelial cell marker PECAM1 was also measured. The ICC was calculated as a measure of the degree of similarity between measurements carried out at different times for the same animal, compared to the degree of similarity of measurements for the different animals. While there was a substantial lack of similarity among the weekly measurements on the same animal for some mRNAs, for three there was a relatively large ICC (FIGS. 7A-7F), verifying that there was substantial reproducibility in the measurement method. Nevertheless, it was clear that multiple samples would be necessary for the greatest accuracy. We also note that for three of six mRNAs (B2M, CYP1B1, and PECAM1), there were significant differences in expression levels in tumor vs. control samples (see Table 2).

TABLE 2 Comparison of mRNA levels in control vs. tumor in samples acquired by different methods

                       Method of cell acquisition
Gene                   Brush cytology       Tissue biopsy
B2M       Control      1.08 ± 0.111         4.00 ± 0.322
          Tumor        2.59 ± 0.446         4.92 ± 0.576
          P-value      0.00271              0.158
CDK2AP1   Control      1.55 ± 0.168         7.11 ± 0.820
          Tumor        0.709 ± .0847        5.31 ± 0.632
          P-value      0.0643               0.149
CYP1B1    Control      13.2 ± 2.89          4.99 ± 0.96
          Tumor        4.65 ± 1.42          1.22 ± 0.35
          P-value      0.0154               0.0123
GSTP1     Control      3.99 ± 0.419         0.50 ± 0.06
          Tumor        2.98 ± 0.750         0.51 ± 0.14
          P-value      0.862                0.941
PECAM1    Control      0.447 ± 0.0781       13.3 ± 0.1.88
          Tumor        1.10 ± 0.202^(a)     8.01 ± 0.938
          P-value      0.00129              0.0572
VEGF      Control      3.34 ± 0.478^(a)     15.3 ± 2.84
          Tumor        2.43 ± 0.497         17.6 ± 2.12
          P-value      0.381                0.520

^(a)A two-tailed Student's t-test was used to compare the statistical significance of the differences in mRNA levels of control mucosa and tumor.

Comparison of mRNA Levels in Brush Cytology and Tissue Biopsy Samples

It was then tested whether tumor-associated changes in the level of a specific mRNA in brush cytology samples would also be observed in RNA from tissue biopsies from the same animals. One week after the last cytologic sample was taken, the animals were killed and tissue from tumor and normal areas was taken by dissection to produce a tissue biopsy sample. To allow a comparison of RNA from all four sample types, RNA quantities were normalized to internal standards, PPIA and ACTB. An average for the three brush cytology samples is represented in the bar graph and is plotted next to the value obtained from the RNA from surgically biopsied tissue from the same animal (FIGS. 8A-8F). Surprisingly, the results were to some degree dependent on the sampling method. First, we note that there was minimal correlation between relative mRNA levels in brush cytology samples and surgical biopsy samples from the same animal (FIGS. 8A-8F). Secondly, the same figure demonstrates that the levels of specific mRNAs depend on the sample type. Thirdly, only one of six genes, CYP1B1, showed a change in expression with tumor formation in the surgically excised tissue (Table 2). Specifically, CYP1B1 showed increased expression in the early timepoints and decreased expression in the later timepoints. In contrast, CYP1B1 and two other genes showed changes in the brush cytology samples with tumor formation. Brush cytology mRNA quantitation was reproducible but different from tissue biopsy mRNA. One simple explanation would be that brush cytology RNA was derived from different cells than the tissue biopsy RNA.

Brush Cytology Sample RNA Was Highly Enriched for Epithelial Markers

To determine the identity and purity of brush cytology cells, we subjected normal tissue sections to immunofluorescence analysis of epithelial cytokeratins. A control experiment showed high levels of expression in the epithelium but not in the dermis (located below the basement membrane) of an immunostained section of biopsied tissue (FIG. 9A). In the brush cytology sample, over 95% of cells contained high levels of these proteins (FIG. 9B). Further, RNA from five different brush cytology samples from control hamsters was compared to RNA from tissue biopsy samples from the same hamsters. In the brush cytology sample RNA, the epithelial cell markers E-cadherin and connexin-26 (CADH1 and CX26) were enriched, while desmin (DES), a muscle cell marker, and vimentin (VIM), a marker for mesenchymally derived cells, were depressed (FIG. 9C). This is consistent with the brush cytology sample being greatly enriched for mucosal epithelial cells compared to the tissue biopsy cells.

While the present invention has been described in terms of specific methods, structures, and devices, it is understood that variations and modifications will occur to those skilled in the art upon consideration of the present invention. For example, the methods and compositions discussed herein can be utilized beyond the preparation of metallic surfaces for implants in some embodiments. As well, the features illustrated or described in connection with one embodiment can be combined with the features of other embodiments. Such modifications and variations are intended to be included within the scope of the present invention. Those skilled in the art will appreciate, or be able to ascertain using no more than routine experimentation, further features and advantages of the invention based on the above-described embodiments. Accordingly, the invention is not to be limited by what has been explicitly shown and described.

All publications and references are herein expressly incorporated by reference in their entirety. The terms “a” and “an” can be used interchangeably, and are equivalent to the phrase “one or more” as utilized in the present application. The terms “comprising,” “having,” “including,” and “containing” are to be construed as open-ended terms (i.e., meaning “including, but not limited to,”) unless otherwise noted. Recitation of ranges of values herein is merely intended to serve as a shorthand method of referring individually to each separate value falling within the range, unless otherwise indicated herein, and each separate value is incorporated into the specification as if it were individually recited herein. All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., “such as”) provided herein, is intended merely to better illuminate the invention and does not pose a limitation on the scope of the invention unless otherwise claimed. No language in the specification should be construed as indicating any non-claimed element as essential to the practice of the invention.

1. A method for detecting the likelihood that a subject has squamous cell carcinoma, comprising: obtaining a brush cytology sample from a subject, extracting nucleic acids from cells in the sample, and assaying the nucleic acids for expression levels of non-degraded nucleic acid sequences coding for production of beta-2 microglobulin (B2M) and cytochrome p450 1B1 (CYP1B1), wherein over-expression of the B2M gene compared to a standard, together with under-expression of the CYP1B1 gene compared to a second standard is indicative of a likelihood that the subject has squamous cell carcinoma.
 2. The method of claim 1, wherein the step of obtaining a sample further comprises obtaining oral squamous cells.
 3. The method of claim 1, wherein the step of obtaining the sample further comprises at least 20 brush strokes.
 4. The method of claim 1, wherein the step of obtaining the sample further comprises taking 2 to 5 initial brush strokes to prime the surface, followed by at least 20 brush strokes to obtain the sample.
 5. The method of claim 1, wherein the step of assaying the nucleic acids further comprises amplifying and quantifying expression of the B2M gene and the CYP1B1 gene by real time polymerase chain reaction (q-PCR) using primers complementary to an mRNA sequence of at least 15 bases located at least 500 basepairs from encoded 3′ ends of B2M and CYP1B1 transcripts.
 6. A method for detecting the likelihood that a subject has squamous cell carcinoma, comprising detecting beta-2 microglobulin (B2M) and cytochrome p450 1B1 (CYP1B1) protein or nucleic acid expression levels in a sample from the subject, wherein over-expression of the B2M gene compared to a standard together with under-expression of the CYP1B1 gene compared to a second standard is indicative of a likelihood that the subject has squamous cell carcinoma.
 7. A method for detecting the likelihood that a subject has a precancerous squamous cell disorder, comprising detecting beta-2 microglobulin (B2M) and cytochrome p450 1B1 (CYP1B1) protein or nucleic acid expression levels in a sample from the subject, wherein over-expression of the B2M gene compared to a standard together with over-expression of the CYP1B1 gene compared to a second standard is indicative of a likelihood that the subject has a precancerous squamous cell disorder.
 8. A method for monitoring squamous cell neoplasia in a human subject over time, comprising: obtaining a brush cytology sample from a subject at a first time, extracting nucleic acids from cells in the sample, assaying said nucleic acids for the expression level of genes coding for the production of beta-2 microglobulin (B2M) and cytochrome p450 1B1 (CYP1B1), and repeating the steps of obtaining a sample, extracting nucleic acids and assaying for expression levels of B2M and CYP1B1 at a later time, wherein increased expression of the B2M gene at a later time or decreased expression of the CYP1B1 gene at a later time is indicative of progression of neoplasia.
 9. A method for detecting the likelihood that a subject has a precancerous squamous cell disorder, comprising: obtaining a brush cytology sample from a subject, extracting nucleic acids from cells in the sample, and assaying the nucleic acids for expression level of non-degraded nucleic acid sequences coding for production of beta-2 microglobulin (B2M) and cytochrome p450 1B1 (CYP1B1), wherein over-expression of the B2M gene compared to a standard together with over-expression of the CYP1B1 gene compared to a second standard is indicative of a likelihood that the subject has a precancerous squamous cell disorder.
 10. The method of claim 9, wherein the brush cytology sample comprises squamous cells.
 11. The method of claim 9, wherein the step of assaying the nucleic acids further comprises amplifying and quantifying expression of the B2M gene and the CYP1B1 gene by real time polymerase chain reaction (q-PCR) using primers complementary to an mRNA sequence of at least 15 bases located at least 500 basepairs from encoded 3′ ends of B2M and CYP1B1 transcripts.
 12. The method of claim 9, wherein the step of obtaining the sample further comprises at least 20 brush strokes.
 13. A kit for assessing the presence of cancer in a sample comprising: a pair of primers which specifically hybridize to at least one non-degraded nucleic acid sequence coding for production of a beta-2 microglobulin (B2M) gene or a cytochrome p450 1B1 (CYP1B1) gene; and reagents for real-time polymerase chain reaction (q-PCR).
 14. The kit of claim 13 according to the method of claim 1.
 15. The kit of claim 13 according to the method of claim 9.
 16. The kit of claim 13, further comprising a brush to obtain a brush cytology sample.
 17. The kit of claim 13, further comprising a nucleic acid extraction reagent.
 18. A method of treating or inhibiting a squamous cell neoplasia, the method comprising administering a bioactive agent that inhibits cytochrome p450 proteins to the subject with the squamous cell neoplasia, wherein the bioactive agent inhibits at least cytochrome p450 1B1 (CYP1B1), cytochrome p450 1A1 (CYP1A1) and combinations thereof.